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Abstract 

We examine theoretic architectures and an abstract model for a restricted class of 
quantum computation, called here temporally unstructured ("instantaneous") quantum 
computation because it allows for essentially no temporal structure within the quantum 
dynamics. Using the theory of binary matroids, we argue that the paradigm is rich 
enough to enable sampling from probability distributions that cannot, classically, be 
sampled from efhciently and accurately. This paradigm also admits simple interactive 
proof games that may convince a skeptic of the existence of truly quantum effects. Fur- 
thermore, these effects can be created using significantly fewer qubits than are required 
for running Shor's Algorithm. 



1 Introduction 

1.1 Mathematical motivation 

It has often been said that underlying the power of quantum computers is the close con- 
nection between the computational model and the way we represent dynamics in quantum 
systems. This connection is implicit in the standard circuit model, where we require a uni- 
versal gate set for an n-logical-qubit processor to be capable of simulating the dynamics of 
the n-qubit unitary group 5[/(2"). While there are many equivalent models of (universal) 
quantum computing, and not all of them explicitly 'generate' the special unitary group on 
n qubits, they each simulate (to within some pre-defined precision) operations drawn from 
some non-abelian unitary group on a set of qubits. Our approach in this paper departs from 
this well trod path, by focussing almost exclusively on an abelian subgroup of the unitary 
group. This approach is much more restrictive in the kinds of computation allowed, leading 
to a computational paradigm that lies somewhere between classical and universal quantum 
computing. 

The non-abelian nature of quantum circuit elements is undoubtedly a crucial feature of 
universal quantum computing; for example, it imposes a clear physical limitation to the 
time-ordering of the gates in a circuit. In the standard model of quantum computation, 
the only circuits that can be performed in a single "time-step" are those composed only of 
single-qubit gates and two-qubit gates that act on disjoint sets of qubits. We often refer 
to such circuits as depth-1 circuits. When an abelian group is being used for the gates 
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within a circuit, that circuit need not be depth-1 in the sense just described, though it will 
nonetheless be essentially devoid of temporal structure, since the order of the gates is im- 
material. Physically, the quantum circuit model can be interpreted as applying a controlled 
sequence of unitary operations, which can in turn be thought of as a sequence of Hamilto- 
nian evolutions. If any two consecutive gates in a sequence commute with one another, then 
their order in the sequence can be freely interchanged, or equivalently, their Hamiltonians 
can simply be combined additively, which corresponds to simultaneous evolution. When all 
gates commute, a single simultaneous Hamiltonian evolution describes the dynamics, whose 
terms are the individual gates. 

1.2 Physical motivation 

How can we tell when we have successfully built a quantum computer? Given that tomog- 
raphy quickly becomes difficult as the number of qubits in a system grows, it is pertinent 
to ask if there is a simpler way of verifying the success of a quantum computation. One 
way, which has already been attempted in several experiments, e.g. [3 El [H [18] , would be 
to use the prototype quantum computer to find the solution to a problem which we think 
is difficult to solve on a classical computer. 

For instance, the following scenario is generic. Alice is a skeptic, she doesn't believe that 
Bob has a quantum device at his disposal. Fortunately, she is relatively certain that clas- 
sical computers can't efficiently find the prime factors of a large integer, whereas quantum 
computers can [16] (although many qubits may be required for a convincing demonstration). 
So she issues a challenge to Bob : she chooses a large number for which she cannot find the 
prime factors and sends it to Bob. If Bob then sends back the prime factors of her number 
within a reasonable time period, she can easily convince herself that Bob must have had a 
quantum device at his disposal. 

This scenario in particular is one which has been used in attempts to verify the success of 
several small-scale quantum computers [71[6l[8], though of course the numbers used were too 
small to be considered hard to factor classically. Unfortunately, so far as we know, Shor's 
factoring algorithm is a relatively difficult quantum algorithm to perform. It is well known 
that it can be implemented in a circuit model using polynomial circuit depth and linear 
circuit width, or logarithmic depth with a larger width [3j, or even with constant depth if 
arbitrarily wide 'fanout' gates are allowed [1]. In either case, we'd apparently require a fully 
universal set of quantum gates, and more than a thousand logical qubits, for a convincing 
demonstration. 

In this paper, we use abelian dynamics to suggest a two-party protocol (with classical 
message-passing), which could be used to test remotely a quantum device, which we be- 
lieve is physically far less complex than factorization. We conjecture that it is classically 
infeasible to simulate the quantum process in our protocol, and that our protocol is simpler 
to implement than all known versions of Shor's algorithm, not requiring anything like a 
universal gate set. 
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1.3 Guide to the paper 

We introduce the paradigm IQP, which stands for "Instantaneous Quantum Polytime". It 
is a restricted model of quantum computation, which can also be thought of in terms of an 
oracle for computation. Here 'polytime' means that the process is bound to consume at most 
a polynomial amount of resource in any reasonable model of quantum computation, while 
'instantaneous' means that the algorithmic process itself is to contain no inherent temporal 
structure. We give formal definitions, and argue that there are non-trivial applications for 
IQP. In particular, a two-party interactive protocol game is described and discussed. 

These are our main points, to bear in mind throughout the paper : 

• We define a restricted model of quantum computation, called IQP (section [2]) . 

• We present several different quantum architectures (section [5|) that can render com- 
putations in the IQP model, suggesting that it is a 'lowest common denominator' of 
some 'natural' ideas for computing based on 'abelian' notions. 

• Although we know of no specific architecture where the Hamiltonians and mea- 
surements involved in the IQP paradigm really are genuinely 'easy' to implement, 
nonetheless there is a clear sense in which the mathematics that underlies the compu- 
tation is 'easy'. Perhaps the "Graph State" architecture (section [5]) gives the clearest 
example of a practical computing idea. 

• We argue and conjecture that the probability distributions generated in the IQP 
paradigm (section [2|) are not only classically hard to sample from approximately, but 
that there are actual polynomial-time protocols (section [3]) that can be completed 
using an IQP oracle that (we believe) can't be completed classically in polynomial 
time. 

• We provide an explicit example of such a protocol, involving two parties. The purpose 
of the protocol is simply for one party to prove to the other that they are capable 
of approximating a multi-qubit output distribution having characteristics that match 
the output distribution of a particular IQP process. In the protocol, Alice designs a 
problem with a 'hidden' property, and sends it to Bob; Bob runs the problem through 
his IQP oracle several times, and sends the classical outputs back to Alice; Alice then 
uses the secret 'hidden' property to assess whether there is good evidence for Bob 
having used a real IQP oracle. This is all done in section [3l 

• We make a pragmatic analysis of our suggested protocol, showing how to fine-tune 
its parameters in order to make plausible the conjecture that it really can't be 'faked' 
classically (section [4]) . 

• By analogy, this protocol is to quantum computation what Bell experiments are to 
quantum communication : the simplest known 'proof of a distinctly quantum phe- 
nomenon. (Of course, since there is no mathematical proof published to date of a 
separation between the power of quantum computation and classical computation, we 
still have to rely on certain computational hardness conjectures.) 

• Despite the existence of protocols apparently requiring an IQP oracle, we are unable 
to find any decision language in BPP^*^^ that is not in BPP. There seems to be a 
sense in which the paradigm isn't able to 'compute new information' (section . 

• If there should one day be architectures that can implement IQP oracles — even 
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though full-blown universal quantum computing remain an unresolved engineering 
challenge — then our protocol may be an important demonstrator of the power of 
quantum mechanics for quantum computing. 

Much of the mathematics used depends significantly on the theory of binary matroids and 
binary linear codes, and so we spend some time in section [2] recalling some basic definitions. 
Readers interested in the main construction (the interactive game) should start at section [3] 
and dip back into the earlier definitions where needed. Pure mathematicians might prefer 
to start at section 13.3.31 cryptographers might particularly appreciate section [H whereas 
physicists may prefer section [5j 



2 The IQP paradigm 

In this section, we define what we call the "X-program" architecture, and use it to define 
the notion of IQP oracle. The rest of the paper depends heavily on these definitions. Note 
that the X-program architecture is not particularly 'physical', but is easier to work with 
than the more physically relevant architectures discussed in section [5l 



2.1 X-programs 

Recall that a Pauli X operator acts on a single qubit, exchanging | 0) with 1 1), i.e. X = 
|0)(1 1)(0|. One can also think of X as a Hamiltonian term, since X oc exp(z^X) . An 
X-program is essentially a Hamiltonian that is a sum of products of Xs acting on different 
qubits. 

In this architecture, allow for a set of n qubits, initialised into the pure separable compu- 
tational basis state |0). The X-program is specified as a (polysize) list of pairs (^p,p) € 
[0, 27r] X F2, so 6p is an angle and p is a string of n bits. Each such program element 
(pair) is interpreted as the action of a Hamiltonian on the qubits indicated by p, applied for 
actioij^ 0p : the Hamiltonian to apply is made up from a product of Pauli X operators on 
the indicated qubits, and naturally these all commute. This means that — in principle — the 
program elements could all be applied simultaneously : their time ordering is irrelevant. 
The measurement to be performed, once all the Hamiltonians have been applied, is simply 
a computational-basis measurement, and the program output is simply that measurement 
result, regarded as a (probabilistic) sample from the vectorspace F2 . 

Combining this together, we see that the probability distribution for such an output is 



F(X = x) 



(x| exp J]i^p (g) X, |0") 



(1) 



The output string here is labelled x. The random variable X here (and throughout) codifies 
this probability distribution of classical output samples. 

For almost all of our purposes, we are only interested in the case where the 9^ action values 
are the same for every term. When that condition applies, and the value 9 is specified, the 
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entire X-program can be represented using a poly{n)-hy-n binary matrix, with each row 
corresponding to a term in the Hamiltonian. For example, the 7-by-4 binary matrix 



P = 



/ 1 \ 
110 
110 
10 11 
10 1 
10 

V 1 / 



(2) 



would represent the 4-qubit Hamiltonian 



9 ■ {Xi + X1X2 + X2X3 + XiX^X^ + X2X4 + X3 + X4). 



(3) 



2.2 IQP oracle 



Definition 1 On input the explicit description of an X-program, we define an IQP oracle 
to be any computational method that efficiently returns a sample string from the probability 
distribution as given at line (1^. 

For a formal definition of what is meant by an IQP oracle, let it be any device that interfaces 
to a probabilistic Turing machine via an 'oracle tape', so that if the oracle tape holds a 
description of a particular X-program (P, 6 in the 'constant action' case) at the time when 
the Turing machine calls its 'implement oracle' instruction, then in unit time (or perhaps 
in time polynomial in the length of the description of P, i.e. polynomial in n), a bitstring 
sample in F2 from the probability distribution at line ([1]) is created and written to the 
oracle tape, and control is passed back to the Turing machine to continue processing. 

We write the overall paradigm — of classical computation augmented by this oracle — as 
gppiQP^ to denote the fact that classical randomised polytime pre- and post-processing 
is usually to be considered allowed in a simulation, and to denote the fact that we don't 
much care which of several quantum architectures might be being used to supply the 'IQP- 
power' of sampling from probability distributions of the form at line ([T]). This notation is 
not necessarily supposed to indicate a particular class of decision languages as such, but 
rather a particular class of computations. The interactive proof games in section [3] require 
the Prover to have access to an IQP oracle, and to access it a polynomial number of times, 
though these calls may be made in parallel and without precomputation. 

Lemma 1 The probability distribution given at line (Cp is equivalent to the one given below. 



Proof : Let P denote the A;-by-n binary matrix whose rows are the p vectors of the 
X-program under consideration. Then using the fact that the Hamiltonian terms in an 
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F(X = x) 




I sm 



p 



(4) 
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X-program all commute, we can think of the quantum amplitudes arising in an X-program 
implementation as a sum over paths, 




n 

= (^1 E n ^°^^p n ^sin^p n^f''^' io")> (5) 

agpfc p : ap=0 p : ap=l j=l 

and hence derive a new form for the probability distribution accordingly. ■ 
2.3 Binary matroids and Linear binary codes 

Before proceeding to the main topics of the paper, it behooves us to establish the link that 
these formulas have with the (closely related) theories of binary matroids and binary linear 
codes. 

Definition 2 A linear binary code, C, of length k is a (linear) subspace of the vectorspace 
W2, represented explicitly. The elements ofC are called codewords, and the Hamming weight 
wt{c) G [0..k] of some c £ C is defined to be the number of Is it has. The rank of C is its 
rank as a vectorspace. 

Linear binary codes are frequently presented using generator matrices, where the columns 
of the generator matrix form a basis for the code. If P is a generator matrix for a rank r 
code C, then P has r columns and the codewords are 

There are many different, isomorphic, definitions for matroids, [10]. We shall adopt the 
following definition. 

Definition 3 A k-point binary matroid is an equivalence class of matrices defined over 
F2, where each matrix in the equivalence class has exactly k rows, and two matrices are 
equivalent (written Afi ~ M2) when for some (k-by-k) permutation matrix Q, the column- 
echelon reduced form of Mi is the same as the column- echelon reduced form ofQ-M2. Here 
we take column- echelon reduction to delete empty columns, so that the result is full-rank. 
Hence the rank of a matroid is the rank of any of its representatives. 

Less formally, this means that a binary matroid is like a matrix over F2 that doesn't notice 
if you rearrange its rows, if you add one of its columns into another (modulo 2), or if you 
duplicate one of its columns. This means that a matroid is like the generator matrix for a 
linear binary code, but it doesn't mind if it contains redundancy in its spanning set {i.e. 
has more columns than its rank) and it doesn't care about the actual order of the zeroes 
and ones in the individual codewords. To be clear, when thinking of a matrix such as P in 
line ([2]) , we are simultaneously thinking of its columns as the elements of a spanning set for 
a code, and its rows as the points of a corresponding matroid. Because one cannot express a 
matroid independently of a representation, we consistently conflate notation for the matrix 
P with the matroid P that it represents. 

Perhaps the main structural feature of a binary matroid is its weight enumerator polyno- 
mial. 
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Definition 4 // the k rows of binary matrix P establish the points of a k-point matroid, 
then the weight enumerator of the matroid is defined to be the weight enumerator of the 
k-long code C spanned by the columns of P, which in turn is defined to be the bivariate 
polynomial 

WEPc{x,y) = ^^«'t(c)yfe-^t(c)^ (g^ 



This is well-defined, because the effect of choosing a different matrix P that represents 
the same binary matroid simply leads to an isomorphic code that has the same weight- 
enumerator polynomial as the original code C. 



2.4 Bias in probability distributions 

Definition 5 If ^ is a random variable taking values in F2 , and s is any element of F2 , 



then the bias of X in direction s is simply the probability that X 
probability of a sample being orthogonal to s. 



IS zero, I.e. 



the 



Let us now consider an X-program on n qubits that has constant action value 9, whose 
Hamiltonian terms are specified by the rows of matrix P, as discussed earlier. Then we can 
use lemma [T] to obtain the following expression of bias, for any binary vector s G F2 : 



F(X' 



0) 



k—wt(a) 



{i sin 9) 



wt{a) 



(7) 



aP= 



Since it would obviously be nice to interpret this expression as the evaluation of a weight 
enumerator polynomial, we are led to define Pg to be the submatrix of P obtained by 
deleting all rows p for which p • = 0, leaving only those rows for which p • S"^ = 1. We 
caljl the number of rows remaining ng. This in turn leads to the code Cg being the span 
of the columns of Ps, and likewise a submatroicH is correspondingly defined. 

Theorem 1 When considering constant- action X-programs, the bias expression F(X-S"^ = 
0) for the random variable X introduced at line (CP depends only on the action value 9 and 
(the weight enumerator polynomial of) the Ug-point matroid Pg, as defined above. Moreover, 
if Cg is a binary code representing the matroid Pg, then the following formul^ expresses the 
bias : 

F(X-s^ = 0) = Ec^Cs 



cos 



9{ng - 2 • wt{c)) j 



(8) 



Proof : See the appendix for a proof. ■ 

^ris is here being used for the length of the code Ca in deference to the usual practice of reserving the 
letter n for code lengths. This ris is counting a number of rows, and should not be confused with the n used 
earlier for counting a number of columns. 

^also called a matroid minor 

^Subscripts on expectation operators indicate a variable ranging uniformly over its natural domain. 
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To recap, this means that if we run an X-program using the action value 9 for all program 
elements, then the probability of the returned sample being orthogonal to an s of our 
choosing ('orthogonal' in the F2 sense of having zero dot-product with s) depends only on 
6 and on the (weight enumerator polynomial of the) linear code obtained by writing the 
program elements p as rows of a matrix and ignoring those that are orthogonal to s. 

There is a definition in the literature for weighted matroids, which in this context would 
correspond to allowing different 9 values for different terms in the Hamiltonian of an X- 
program. While mathematically (and physically) natural, such considerations would not 
help with the clarity of our presentation. 

We emphasise at this point the value of theorem [T] : it means that for any direction s € F2, 
the bias of the output probability distribution from an X-program (P, 9) in the direction s 
depends only on 9 and the rows of P that are not orthogonal to s, and not at all on the 
rows of P that are orthogonal to s. Moreover, the bias in direction s depends only on the 
matroid Pg, and not on the particular matrix Pg that represents it. That is, directional bias 
(definition E]) is a matroid invariant. 

Note that whenever A is an n-by-n invertible matrix over F2, then p • = p ■ A ■ A^^ ■ 
= (p • yl) • (s • A^'^)'^ , so any invertible column operation on matrix P accompanies an 
invertible change of basis for the set of directions of which s is a member. Note also that 
appending or removing an all zero column to P has the effect of including or excluding a 
qubit on which no unitary transformations are performed. Thus if Pg is a submatroid of 
P by point-deletion, as described earlier, then if the invertible column transformation A is 
applied to the matrix P that represents the matroid P, then the same matroid that was 
formerly called Pg is still a submatroid, but now it is represented by the matrix -Pg.^-T. 
Likewise, appending or removing a column of zeroes to P necessitates an extra zero be 
appended or removed from any s that serves as a direction for indicating a submatroid. 
This is purely an issue of representation, and we consider that intuition about these objects 
is aided by taking an 'abstractist' approach to the geometry. 

2.5 Entropy, and trivial cases 

Because it will be useful later, we will define the Renyi entropy (collision entropy) of a 
random variable, before exemplifying theorem [T] and proceeding with the main construction 
of the paper. 

Definition 6 The collision entropy, S2, of a discrete random variable, X, measures the 
randomness of the sampling process by measuring the likelihood of two (independent) samples 
being the same. It is defined by 



And so there are a few 'easy cases' for our X random variable of lemma [T] that should be 
highlighted and dismissed up front : 

Lemma 2 For a constant- action X-program, if 9 is. . . 




X 



(9) 
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• ... a multiple of it, then the returned sample will always be 0. The collision entropy 
will be zero. 

• ...an odd multiple of 7^/2, then the returned sample will always be X^pgpP- The 
collision entropy is zero. 

• . . . an odd multiple of it /A, then the collision entropy need not be zero, but the proba- 
bility distribution will be classically simulable to full precision. 

Proof : In the first case, considering line (jl]), there is then a sin(7r) = factor in every 
term of the probability, except where x = 0. 

In the second case, considering again line ([4]), there is then a cos(7r/2) = factor in every 
term, except where all the p vectors are summed together to give x. The same can also be 
deduced from theorem [H which implies that x will be surely orthogonal to s exactly when 
Us is even, i.e. exactly when an even number of rows of P are not orthogonal to s, i.e. 
exactly when X^pgp p is orthogonal to s. 

For the third case, if 9 is an odd multiple of 7r/4, then all the gates in the program would be 
Clifford gates. By the Gottesman-Knill theorem there is then a classically efficient method 
for sampling from the distribution, by tracking the evolution of the system using stabilisers, 
etc. ■ 

For other sufficiently different values of the action parameter, classical intractibility becomes 
a plausible conjecture (c/. [EUll]). In particular, the remainder of this paper will specialise 
to the case 9 = tt/S, since we are able to make all our points about the utility of IQP even 
with this restriction. 



3 Interactive protocol 

One would naturally like to find some 'use' for the ability to sample from the probability 
distribution that arises from a temporally unstructured quantum polytime computation; 
a 'task' or 'proof that can be completed using e.g. an X-program, which could not be 
completed by purely classical means. In this section we develop our main construction; 
a two-player interactive protocol game in which a Prover uses an IQP oracle simply to 
demonstrate that he does have access to an IQP oracle. 

3.1 At a glance 

There are three aspects of design involved in specifying an actual "Alice &: Bob" game : 

A) a code/matroid construction, for Alice to select a problem, to send to Bob, 

B) an architecture or technique by which Bob to take samples from the IQP distribution 
of the challenge he receives, to send back to Alice, 

A') an hypothesis test for Alice to use to verify (or reject) Bob's attempt. 

We have already defined the X-program architecture, and in section [5] we discuss some 
alternatives that Bob might like to try. In section [J] we make an analysis of some classical 
cheating strategies for Bob, in case he can't lay his hands on a quantum computer. The 
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details of precisely how to make a good hypothesis test are omitted from this paper for the 
sake of brevity, but sourcecode is available on our website (see section I3.4p . 

Alice plays the role of the Challenger /Verifier, while Bob plays the role of the Prover. Alice 
uses secret random data to obfuscate a 'causal' matroid Pg inside a larger matroid P, and 
the latter she publishes (as a matrix) to Bob. Bob interprets matrix P as an X-program 
to be run several times, with 9 = vr/S. He collects the returned samples, and sends them 
to Alice. Alice then uses her secret knowledge of 'where' in P the special Pg matroid is 
hidden, in order to run a statistical test on Bob's data, to validate or refute the notion that 
Bob has the ability to run X-programs. 

This application is perhaps the simplest known protocol, requiring (say) ~ 200 qubits, that 
could be expected to convince a skeptic of the existence of some computational quantum 
effect. The reason for this is that there seems to be no classical method to fake even a 
classical transcript of a run of the interactive game between Challenger and Prover, without 
actually being (or subverting the secret random data of) the classical Challenger. 

3.2 Concept overview 

Consider therefore the following game, between Alice and Bob. Alice, also called the Chal- 
lenger/Verifier, is a classical player with access to a private random number generator. Bob, 
also called the Prover, is a supposedly quantum player, whose goal is to convince Alice that 
he can access an IQP oracle, i.e. run X-programs. The rules of this game are that he has to 
convince her simply by sending classical data, and so in effect Bob offers to act as a remote 
IQP oracle for Alice, while Alice is initially skeptical of Bob's true IQP abilities. 

3.2.1 Alice's challenge 

The game begins with Alice choosing some code Cg that has certain properties amenable to 
her analysis. She chooses the code Cg in such a way that there is a for which the (quantum) 
expectation value at line dH]) of theorem [1] is somewhere well within (^, 1), and for which 
the corresponding expectation value that arises from the best-known classical approaches 
to 'cheating' {e.g. presumably the one at line (j20p of section 14.41 in case 6 = tt/S) is 
significantly smaller. 

She then finds a matrix Pg whose columns generate the code (not necessarily as a basis), 
and ensures that there is some s that is not orthogonal to any of the rows of Pg . The vector 
s should be thought of not as a structural property of the code Cg, but as a 'locator' that 
can be used to 'pinpoint' Pg even after is has later been obfuscated. 

Obfuscation of Pg is achieved by appending arbitrary rows that are orthogonal to s. This 
gives rise to matrix P. The matroid P has Pg as a submatroid, in the sense that removal of 
the correct set of rows will recover Pg . Alice publishes to Bob a representation of matroid P 
that hides the structure that she has embedded. Random row permutations are appropriate, 
and reversible column operations likewise leave the matroid invariant (though the latter will 
affect s and must therefore be tracked by Alice). 
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3.2.2 Bob's proof 



Bob, being BPP^'^^ -capable by hypothesis, may interpret the pubhshed P as an X- 
program, to be run with the (constant) action set to 9 = tt/S (say). He wiU be able to 
generate random vectors which independently have the correct bias in the (unknown to 
him) direction s, i.e. the correct probability of being orthogonal to Alice's secret s, in 
accordance with theorem [TJ Although he may still be entirely unable to recover this s from 
such samples, he nonetheless can send to Alice a list of these samples as proof that he is 
BPpiQP.capable. 

Note that Bob's strategy is error-tolerant, because if each run of the IQP algorithm were 
to use a 'noisy' 6 value, then the overall proof that he generates will still be valid, providing 
the noise is small and unbiased and independent between runs. Note also that he can 
manage several runs in one oracle call, if desired, simply by concatenating the matrix P 
with itself diagonally. That is to say, we even avoid classical temporal structure (adaptive 
feed-forward) on Bob's part. 

3.2.3 Alice's verification 

Since Alice knows the secret value s, and can presumably compute the value F(X ■ s^ = 0) 
from the code's weight enumerator polynomial (see theorem [T] and recall that she is free to 
choose any Cg that suits her purpose), it is not hard for her to use a hypothesis test to confirm 
that the samples Bob sends are commensurate with having been sampled independently from 
the same distribution that an X-program generates. That is to say, Alice will not try to test 
whether Bob's data definitely fits the correct IQP distribution, but she will ensure that it 
has the particular characteristic of a strong bias in the secret direction s. This enables her 
to test the null hypothesis that Bob is cheating, from the alternative hypothesis that Bob 
has non-trivial quantum computational power. 

This requires belief in several conjectures on Alice's part. She must believe that there is 
a classical separation between quantum and classical computing; in particular that IQP 
is not classically efficiently approximately simulable — at least she must believe that Bob 
doesn't know any good simulation tricks. She must believe that her problem is hard — at 
least she should believe that the problem of identifying the location of Pg within P is not 
a BPP problem — on the assumption that the matroid Pg is known. 

If he passes her hypothesis test. Bob will have 'proved' to Alice that he ran a quantum 
computation on her program, provided she is confident that there is no feasible way for 
Bob to simulate the 'proof data classically efficiently, i.e. provided she has performed 
her hypothesis test correctly against a plausibly best null hypothesis. In particular, Alice 
should ensure a large collision entropy for the true (IQP) distribution, since she will want 
to remove all 'short circuits' (i.e. all the empty rows and all the duplicate rows) from Bob's 
data, before testing it, to make a test that is both fair and efficient. Otherwise it would be 
too easy for Bob to generate a set of data that has a strong bias in very many directions 
simultaneously; and it would be tedious for Alice to confirm that he has not cheated in this 
way if she did not remove the short circuits. 
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3.2.4 Significance 



This kind of interactive game could be of much significance to vahdation of early quantum 
computing architectures, since it gives rise to a simple way of 'tomographically ascertaining' 
the actual presence of at least some quantum computing, modulo some relatively basic 
complexity assumptions. In this sense it is to quantum computation what Bell violation 
experiments are to quantum communication 

Of course, this protocol really comes into its own when the architecture being tested happens 
to have the undesirable engineering feature of being unable to sustain long-term quantum 
coherence, and therefore perhaps only ever being capable of shallow-depth computation. 
Unfortunately, the prescription for X-programs given in section [2] requires Hamiltonian 
terms that act across potentially hundreds of qubits, and the alternative architectures dis- 
cussed in section [5] have similar physical drawbacks that still make this paradigm extremely 
challenging for today's engineering. 

Note that this 'testing concept' does not use the IQP paradigm to compute any data that 
is unknown to everyone, nor does it directly provide Bob with any 'secret' data that could 
be used as a witness to validate an NP language membership claim. Its only effect is to 
provide Bob with data that he can't use for any purpose other than to pass on to Alice 
as a 'proof of IQP-capability. It is an open problem to find something more commonly 
associated with computation — perhaps deciding a decision language, for example — that can 
be achieved specifically by the BPP^*^^ paradigm. 

3.3 Recommended construction method 

Here is a specific example of a construction methodology (with implicit test methodology) 
for Alice, which we conjecture to be asymptotically secure (against cheating Prover) and 
efficient (for both Prover and Verifier). 

3.3.1 Recipe for codes 

The family of codes that we suggest Alice should employ within the context of the game 
outlined above are the quadratic residue codes. These will be shown to have the significant 
property that there is a non-negligible gap between the quantum- and best-known-classical- 
approximation expectation values for the bias in the secret direction, both of which are 
significantly below 1. (The bias for a truly quantum-enabled Bob has already been defined 
in theorem [H in terms of the weight-enumerator polynomial of the causal code. For a 
classically cheating Bob, we discuss the best-known classical strategies and their biases in 
section 14.41 ) 

Consider a quadratic residue code over F2 with respect to the prime q, chosen so that 
g -|- 1 is a multiple of eight. The rank of such a code is {q + l)/2, and the length is q. A 
quadratic residue code is a cyclic code, and can be specified by a single cyclic generator. 
There are several ways of defining these, but the simplest definition is to take the codeword 

^We have serendipitously identified a construction for which the probability gap — quantum 85.4% over 
classical 75% — precisely matches the gap available in Bell's inequalities. See lemma [S] 
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that has a 1 in the jth place if and only if the Legendre symboH (^j equals 1, i.e. if 



and only if j is a non-zero quadratic residue modulo q. For example, if g = 7 (the 
smallest example) then the non-zero quadratic residues modulo q are {1,2,4}, and so the 
quadratic residue code in question is the rank-4 code spanned by the various rotations of the 
generator (0, 1, 1, 0, 1, 0, 0, 0)^. A basis for this code is found in the columns of the matrix 
at hue dSI). 

Lemma 3 When q is a prime and 8 divides q + l, then there is a unique quadratic residue 
code C (up to isomorphism) of length q over F2, having rank {q + l)/2, and it satisfies 



cos^f ^{q - 2-wt{c)) ) = cos^(7r/8) = 0.854... (10) 



Moreover, it also satisfies 

cf • C2 = I ci,C2 ~ C ) = 3/4 = 0.75, (11) 
which is relevant to certain classical strategies (section\4.4\)- 



Proof : The proof of the rank of the code and its uniqueness are well established results 
from classical coding theory [22]. Other classical results of coding theory include that 
quadratic residue codes are a parity-bit short of being self-dual and doubly even. That 
is, the extended quadratic code, with length q + 1, obtained by appending a single parity 
bit to each codeword, has every codeword weight a multiple of 4 and every two codewords 
orthogonal. 

For line ()10p this means that the (unextended) code has codeword weights which, modulo 
4, are half the time and half the time —1. On putting these values into the left side of 
the formula, we immediately obtain the right side. For line (lllh this means that in the 
(unextended) code, any two codewords are non-orthogonal if and only if they are both 
odd-parity, which happens a quarter of the time : from which the formula follows. ■ 

The corollary here is that if Alice uses one of these codes for her 'causal' Cg, then if Bob 
runs a series of X-programs (with constant 6 = tt/8) described by the (larger) matrix P, the 
data samples he recovers should be orthogonal to the hidden s about 85.4% of the time (c/. 
theorem [1]) ; whereas if Bob tries to cheat using the classical strategy outlined in section [J 
then his data samples will tend to be orthogonal to the hidden s only about 75% of the time 
(c/. lemma [6]). Alice's hypothesis test therefore basically consists in measuring this single 
characteristic, after having filtered duplicate and null data samples from Bob's dataset. We 
would conjecture that Bob has no pragmatic way of boosting these signals, at least not 
without feedback from Alice, or exponential resources. 

Note that with exponential time on his hands. Bob could choose to simulate classically an 
IQP oracle, in order to obtain a dataset with a bias in direction s that is approximately 
85.4%. Alternatively, he could consider every possible s in turn, and test to see whether 
the matroid obtained by deleting rows orthogonal to his guess is in fact correspondent to a 
quadratic residue code, assuming he knew that this had been Alice's strategy. 



®This Legendre symbol is equivalent, modulo q, to j^'' 
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3.3.2 Recipe for obfuscation 

Having chosen q as outlined above, and constructed a q-hy-{q+l) /2 binary matrix generating 
a quadratic residue code, Alice needs to obfuscate it. The easiest way to manage this process 
is not to start with a particular secret s in mind, but rather to recognise the obfuscation 
problem as a matroid problem, proceeding as follows : 

• Append a column of Is to the matrix : this docs not change the code spanned by 
its columns since the all-ones (full-weight) vector is always a codeword of a quadratic 
code. Other redundant column codewords may also be appended, if desired. 

• Append many (say q) extra rows to the matrix, each of which is random, subject to 
having a zero in the column lately appended. This gives rise to a 2qi-point matroid, 
and ensures that there now is an s such that the causal submatroid (quadratic residue 
matroid) is defined by non-orthogonality of the rows to that s. 

• Reorder the rows randomly. This has no effect on the matroid that the matrix rep- 
resents, nor on the hidden causal submatroid. Nor does it affect s, the 'direction' in 
which the sumbatroid is hidden. 

• Now column-reduce the matrix. There is no (desirable) structure within the particular 
form of the matrix before column-reduction, nothing that affects either codes or ma- 
troids. Echelon-reduction provides a canonical representative for the overall matroid, 
while stripping away any redundant columns that would otherwise cost an unnecessary 
qubit, when interpreted as an X-program. By providing a canonical representative, 
it closes down the possibility that information in Alice's original construction of a 
basis for her causal code might leak through to Bob, which might be useful to him in 
guessing s. Rather more importantly, this reduction actually serves to hide s. (We 
can be sure by zero-knowledge reasoning that this hiding process is random : eche- 
lon reduction is canonical and therefore supervenes any column-scrambling process, 
including a random one.) 

• Finally, one might sort the rows, though this is unnecessary. The resulting matrix is 
the one to publish. It will have at least {q + l)/2 columns, since that is the rank of 
the causal submatroid hidden inside. 

3.3.3 Mathematical problem description 

This method of obfuscation amounts to — mathematically speaking — a situation whereby 
for each suitable prime q, we start by acknowledging a particular (public) g'-point binary 
matroid Q, viz the one obtained from the QR-Code of length q. Then an 'instance' of 
the obfuscation consists of a published 2g-point (say) binary matroid P, and there is to be 
a hidden "obfuscation" subset O such that Q = P\0; and the practical instances occur 
with P chosen effectively at random, subject to these constraints. (One could choose to 
make O bigger than q points if that were desired.) This has the feel of a fairly generic 
hidden substructure problem, so it seems likely that it should be NP-hard to determine 
the location of the hidden Q, given P and the appropriate promise of Q's existence within. 
More syntactically, we should like to prove that it is NP-complete to decide the related 
matter of whether or not P is of the specified form, given only a matrix for P. Clearly 
this problem is in NP, since one could provide Q in the appropriate basis as witness. We 
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conjecture this problem to be NP-complete. 

Conjecture 1 The language of matroids P that contain a quadratic-residue code subma- 
troid Q by point deletion, where the size of Q is at least half the size of P, is NP-complete 
under polytime reductions. 

These sorts of conjecture are apparently independent of conjectures about hardness of clas- 
sical efficient IQP simulation, since they indicate that actually identifying the hidden data 
is hard, even for a universal quantum computer. Even should this conjecture prove false, we 
know of no reason to think that a quantum computer would be much better than a classical 
one at finding the hidden Q, notwithstanding Grover's quadratic speed-up for exhaustive 
search. 

One might compare the structure of conjectured] to that of the following important theorem 
from graph theory : 

Proposition 1 The language of graphs G that contain a complete graph K by vertex dele- 
tion, where the size of K is at least half the size of G, is NP-complete under polytime 
reductions. 

This is a classic result, see e.g. [llj, where the problem in different guises is called 'Clique' 
and 'Independent Set' and 'Node Cover'. 



3.4 Challenge 

It seems reasonable to conjecture that, using the methodology described, with a QR-code 
having a value q ~ 500, it is very easy to create randomised Interactive Game challenges for 
BPP^^^-capability, whose distributions have large entropy, which should lead to datasets 
that would be easy to validate and yet infeasible to forge without an IQP-capable com- 
puting device (or knowledge of the secret s vector). We propose such challenges as being 
appropriate 'targets' for early quantum architectures, since such challenges would essentially 
seem to be the simplest ones available (at least in terms of inherent temporal structure and 
number of qubits) that can't apparently be classically met. 



Accordingly, we have posted on the internet (http://quantumchallenges.wordpress.com) a 



$25 challenge problem, of size q = 487, to help motivate further study. This challenge 
website includes the source code (C) used to make the challenge matrix, and also the source 
code of the program that we will use to check candidate solutions, excluding only the secret 
seed value that we used to randomise the problem. 



4 Heuristics 

Next we address in more detail the reasons for thinking the problem classically intractable, 
and also give an accounting of our failure to find a decision language for proving the worth 
of IQP. 
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4.1 Hardness of strong simulation 

In support of the supposed complexity of this paradigm, Terhal and Divincenzo [H], and 
Aaronson p], have already showed that it is PP-complete to strongly simulate the generic 
probability distributions that arise hence. 

Lemma 4 It is P^'^^^ -hard to determine the numerical value o/F(X = 0) (as defined in 
section\^ line (OPJ to within exponential precision, for arbitrary matroids. 

Proof : Let Kerj^{P) = {aJ^ : a-P = 0} denote the linear code for which P is a parity-check 
matrix, and note from line ([!]) and definition U] that the probability in question is a function 
of the weight-enumerator polynomial of this code. Specifically, 

F(X = 0) = \WEPi^^,.^^p){ cose, isinO) . (12) 

By varying 6 over the range (0, 7r/2), accurate values of F(X = 0) would enable the recovery 
of the (integral) coefficients of the weight-enumerator polynomial of KerL{P), which by 
choice of P may be set to be any appropriately sized linear binary code we please. The 
recovery of arbitrary weight-enumerator polynomials is P'^'^P^-hard [21j. ■ 

4.2 Background 

There has been a wide range of work into discovering restricted models of quantum com- 
putation which are classically simulable. For example, quantum circuits generating lim- 
ited forms of entanglement, with classical simulations based on analysing matrix product 
states or contracting tensor networks; these circuits have a particularly constrained 'circuit- 
topology', which leads to their simplicity (see [9] for a summary of known results). There 
is no particular circuit-topology imposed in our Z-network architecture (discussed in sec- 
tion [5|), so it seems unlikely that the same methods would apply here. Other positive 
classical simulability results include the stabiliser circuits of the Gottesman-Knill theorem 
and various matchgate constructions (see \20\ [5l [T5| [H] and references therein) . These con- 
structions differ significantly from our Z-networks in terms of the underlying algebra, the 
group generated by the set of allowable gates. 

H0yer and Spalek [4] have shown that Shor's algorithm for Integer Factorisation can indeed 
be performed within a constant number of timesteps on a Graph State processor (dis- 
cussed in section [5|) , though their constructions offer no reason to believe that that constant 
might be smaller than, say, ~ 100; and of course, a general methodology for reducing the 
inherent time-complexity of oracle-dependent quantum search algorithms is known to be 
impossible, due to lower bounds on Grover's algorithm. Dan Browne [2j wrote about CD- 
decomposability, which is the first rigorous treatment that we know of that explicitly links 
Graph State temporal depth with commutativity of Hamiltonian terms used to simulate a 
Graph state computation. 

Dan Simon [17J wrote about algorithms that use nothing more than an oracle and a 
Hadamard transform, and which therefore could be described as 'temporally unstructured'. 
However, his notion of 'oracle' was one tailored for a universal quantum architecture, be- 
ing essentially an arbitrarily complex general unitary transformation, and since there is 
no natural notion of one of these within our 'temporally unstructured' paradigm, there is 
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no real sense in which Simon's algorithm can count as an example of an algorithm within 
the BPP^*^^ framework. In particular, Simon's oracle implements a unitary that does not 
commute with the Hadamard transform. 

4.3 Conjectures, implicit and explicit 

It is possible to form various hardness conjectures about the classical simulation of these 
IQP probability distributions. For a randomly chosen X-program P of a given width n, 
it seems likely that the associated IQP distribution would be exponentially close to flat 
random. Conditioned on its not being random, there is no particular reason to think it 
would be approximately efficiently classically samplable. Here is an example of one such 
conjecture, though the precise details are not important to our arguments. 

Conjecture 2 There exists a distribution V on the set of X-programs, for which no classical 
Turing machine can gain a non-negligible fl{l/poly) advantage in deciding whether or not 
the distribution associated to an X-program chosen randomly from T> is exponentially close 
in trace distance to the uniform distribution. 

This particular hardness conjecture is not quite what we really require, but it gives an 
example of a plausible conjecture about classical simulation, and implies that for almost 
any X-program of interest, there is a certain event (subset of output possibilities) whose 
probability will (probably) be estimated wrongly by your favourite classical polynomial-time 
event-probability-estimating device. (Many similar conjectures sound equally plausible, in 
an area where almost nothing is known for sure.) The point to emphasise in context of our 
two-player interactive protocol games is that it is not unreasonable for Alice to believe that 
Bob can have no classical cheating strategy so long as none such has been published nor 
proven to exist; and so our protocol may still serve as a demonstration (if not a proof) of 
a genuinely quantum computing phenomenon, despite the lack of proof of any simulation 
conjecture. 

Another conjecture implicit in Alice's ability to make a fair hypothesis test — so that Bob 
will indeed have a good chance of passing the test when he does have an IQP oracle (or 
approximate version of one), but will stand little chance of faking a proof if relying on 
guesswork and (known) classical techniques — is one that ensures that Alice's X-programs 
really do incorporate a non-negligible amount of entropy. Although we have little to go 
on besides scant simulation evidence from small examples, we want to make a conjecture 
that collision entropy is close to maximal within at least one relevant family of random 
constant-action X-programs. 

Conjecture 3 The expected collision entropy of the probability distribution of a randomly 
selected X-program of width n, with constant action ir/S, scales as n — 0(1) with the size of 
the program. 

This conjecture is perhaps not directly relevant to the 'hardness' of the IQP paradigm itself, 
but merely relevant to our game construction. Note, for example, that since arbitrary — or 
random — obfuscation rows are used in the construction of the matrix P in the construc- 
tion of section [3l it follows that there will be much about the random variable X that is 
arbitrary — or 'typical' in some vague sense — to the point that if one were sure that the only 



17 



structure of significance were the hidden 'causal' code Cg, one could hope to approximate 
the distribution for X, using knowledge of s, by sampling uniformly at random (no biases) 
and applying a post-filter to create a bias in direction s of the required strength. This gives 
some context for conjecture [3l 

4.4 Classical approximations 

Rather than speculate at this stage on which of the very many possible conjectures may or 
may not be true, we instead turn back to an examination of the mathematical structures 
underpinning the probability distributions in question. 

Suppose we wish to construct a probability distribution that arises from some purely clas- 
sical methods, which can be used to approximate our IQP distribution. Our motivation 
here is to check whether any purported application for an IQP oracle might not be effi- 
ciently implemented without any quantum technology. We proceed using the relatively ad 
hoc methods of linear differential cryptanalysis. 

4.4.1 Directional derivatives 

For the case 6 = vr/8, we will need to consider only second-order derivatives. The same 
sort of method will apply to the case 9 = using dth order derivatives, but the 

presentation would not be improved by considering that general case here. 

In terms of a binary matrix/X-program P, proceed by defining 

/ : ^ Z/16Z, 

/(a) ^ 5;(-l)P-"" (mod 16), (13) 
peP 

and notate discrete directional derivatives as 

/d(a) = /(a)-/(aed) (mod 16). (14) 

Consider also the second derivatives of /, given by 

/d,e(a) = /e(a) - /e(a©d) (mod 16) 

^ 2 (-l)P-" (l - (-l)P-^") (mod 16) 
pePe 

= 4 ^ (-l)P'^^ (mod 16) 

pGPdOPe 

= 4 ^ n ( ^ ~ 

pePdHPe j.Pj=l 

= Yl 4 + 8 E 16), (15) 

pGPdHPe \ j : Pj=l J 

each of which is quite patently a linear function in the bits (ai, . . . , a„) of a, as a function 
with codomain the ring Z/16Z, regardless of the choice of directions d,e. 



18 



Lemma 5 With f defined as per line U3\) . and X the random variable of lemma d for 
all s, 

IT 



F(X-s^ =0) 



cos 



,2 



/s(a) 



(16) 



and so the IQP probability distribution (in the case = tt/8) may be viewed as a function 
of f rather than as a function of P. 

Proof : Starting from the proof of theorem [H hne (|28p . 



F(X • = 0) 



1 + 



1 + Ee 



expf^(/(a)-/(aes)) 



vr 



cos - • /s(a) 



(17) 



The second hne above is obtained immediately from the first, using the definition of /. The 
third hne fohows because the expression is real-valued. The conclusion follows from a basic 
trigonometric identity, and linearity of the expectation operator. ■ 

And so if there is a hidden s such that F(X • S"^ = 0) = 1, then that implies /s(a) = 
(mod 16) for all a. This is essentially a non-oracular form of the kind of function that arises 
in applications of Simon's Algorithm [T7j, with s playing the role of a hidden shift. One 
could find linear equations for such an s if it exists, because it would follow immediately 
that /d,e(s) = /d,e(0) for any directions d, e, which is by line ()14p equivalent with 



E p 

.pePdHPe 



0. 



(18) 



4.4.2 Classical sampling 

To make use of this specific second-order differential property, we need to analyse the 
probability distribution that a classical player can generate efficiently from it. Proceed by 
defining a new probability distribution for a new random variable Y, as follows : 



F(Y 




(19) 



This may be classically rendered, simply by choosing d, e € F2 independently with a 
uniform distribution, and then returning the sum of all rows in P that are not orthogonal 
to either d or e. 



Lemma 6 The classical simulable distribution on the random variable Y defined in line hi 9^) 
satisfies 

F(Y-s^ = 0) = F(cf-C2 = I ci,C2~Cs) (20) 
= 1 1 + 2-™"'=( ^= ) ) , (21) 
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and so the bias ofY in direction s is a function of the matroid Ps- 
Proof : Starting from line (jl9p . 



F(Y . = 0) = Yl ^d,e E P = y I (22) 

y : y sT=0 \ pePdOPe 

= Fd,e E p • = 

\ pGPdHPe 

= Fd,e {wt{P-d^ A P-e^ A P ■ ) = (mod 2) ) 
= Fd,e {wt{P^-d^ A Ps • ) = (mod 2) ) 
= Fd,e(d-Pi^-Ps-e^ = 0). 

The wedge operator A here denotes the logical AND between binary column- vectors. 

The first line of the lemma follows from the obvious substitutions ci = Pg ■ d^, C2 = Pg ■ 
The second line follows because unimodular actions on the left or right of a quadratic form 
(such as (Pg^ ■ Ps)) affect neither its rank nor the probabilities derived from it; so it suffices 
to consider the cases where it is in Smith Normal Form, i.e. diagonal, which are trivially 
verified. Since this expression is patently invariant under invertible linear action on the 
right and permutation action on the left of Pg, it too is a matroid invariant. ■ 

4.4.3 Inter-relation 

Thus we have established some kind of correlation between random variables X and Y. 

Theorem 2 In the established notation, for X-programs with fixed 6 = tt/S, 

F(X . = 0) = 1 F(Y . = 0) = 1. (23) 

Proof : By theorem [H the antecedent gives, for all c G Cg, rig = 2wt{c) (mod 8), where 
Us is again the length of the code Cg. This entails that every codeword in Cg has the same 
weight modulo 4, including the null codeword, so Cg must be doubly ever0. It is easy to 
see that doubly even linear codes are self-dual|j By lemma [H the consequent is obtained. 
■ 

The only counterexamples to the converse implication seem to occur in the trivial cases 
whereby the binary matroid Pg has circuits of length 2, i.e. where Pg has repeated 
rows. 

Note that if F(X • = 0) were equal to 1 precisely, then by making a list of samples 
from IQP runs, storing them in a matrix, and performing Gaussian Elimination to recover 
the kernel of the matrix, it would be straightforward to compute the hidden s. However, 



^Doubly even just means that every codeword has weight a multiple of 4. 

^Self-duality just means that the dual code — consisting of all words orthogonal to every codeword — is 
equal to the code. 
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theorem [2] shows that this is exactly the condition required for being able to compute s 
via purely classical means. For this reason, it seems hard to find decision languages that 
plausibly he in BPP^QP\BPP. 

This random variable Y is the 'best classical approximation' that we have been able to 
find for X. (The intuition is that it captures all of the 'local' information in the function 
/, which is to say all the 'local' information in the matroid P, so that the only data left 
unaccounted for and excluded from use within building this classical distribution is the 
'non-local' matroid information, which is readily available to the quantum distribution via 
the magic of quantum superposition.) There seems to be no other sensible way of processing 
P (or /) classically, to obtain useful samples efficiently, though it also seems hard to make 
any rigorous statement to that effect. 

Conjecture 4 The classical method defined in this section, yielding random variable Y, 
is asymptotically classically optimal (when comparing worst-case behaviour and restricting 
to polynomial time) for the simulation o/ IQP distributions arising from constant- action 
9 = ir/S X-programs. 

This conjecture lends credence to the design methodology of section [3l 
4.5 Future work 

We might also recommend the further study of matroid invariants through quantum tech- 
niques, or perhaps the invariants of weighted matroids, since they seem to be the natural 
objects of IQP computation as hitherto circumscribed. This would seem to be fertile ground 
for developing examples of things that only genuine quantum computers can achieve. 

Note that if it weren't for the correlation described in theorem [21 then it would be possible to 
conceive of a mechanism whereby an IQP-capable device could compute an actual secret or 
witness to something {e.g. learn s), so that the computation wouldn't require two rounds of 
player interaction to achieve something non-trivial. Yet as it stands, it is an open problem to 
suggest tasks for this paradigm involving no communication nor multi-party concepts. 

5 Architectures 

In this section we sketch two more architectures for implementing IQP oracles. These 
architectures are probably more physically feasible than X-programs. 

5.1 Z-networks 

The network- or circuit-model of computation is perhaps the most familiar one. Programs 
for the first architecture we call "Z-networks", since the program is most easily described 
as a network of gates on an array of qubits, where the allowed gate-set includes just the 
Controlled-Not gate from any qubit to any qubit and the single-qubit gate that implements 
the Pauli Z HamiltoniarH for some time. Although this Z-network architecture does have 

= 10)(0 I - I I, on a single qubit. 
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a notion of temporal structure — because it is important the order in which the gates of the 
network are carried out — nonetheless it is useful for our analysis because it turns out to 
have effectively the same computational power as the X-program architecture under some 
basic assumptions, and the Lie group structure underpinning the kinds of transformation 
allowable within the Z-network architecture is particularly easy to work with. 

On the understanding that n qubits are initialised into | ) in the computational basis and 
ultimately measured in the computational basis, it is well known that the gate-set consisting 
of Controlled-Not gates together with all single-qubit rotations is universal for BQP, and 
the Lie group generated by this gate-set generates the whole of SU{2'^), modulo global 
phase. However, by the term "Z-network" we mean explicitly to limit the single-qubit gates 
to being those which implement e*^^ for some action 9, so that the Lie group spanned by 
the gate-set (represented in the computational basis) consists of unitary matrices that are 
supported by permutation matrices, i.e. those unitaries that have just one non-zero entry 
per row. Any such unitary can be factored into a diagonal matrix followed by a permutation 
matrix. (In all cases, global phase is to be regarded as physically irrelevant, and may be 
'quotiented out' from the groups in question.) 

We can describe groups by giving generator sets for them. The group containing all even 
permutations and all diagonal elements (modulo global phase) is 



This 'qualifies' as a Z-network group; indeed, all Z-network groups are to be a subgroup of 
this one. But for the purposes of comparison with X-programs and the IQP paradigm, it 
suffices to consider the much simpler group given by 



This latter group does not apparently contain (efficiently) the dynamics of classical compu- 
tation. (The X gate is necessary to enable the construction of all diagonals, but one might 
prefer to replace implementation of X by the availability of an ancilla 1 1 ) qubit, so that a 
C-Not gate can simulate an X gate. In the language of complexity theory, (j24p might be 
said to stand in relation to (125]) as P stands to 0L.) 

One can see how to build a variety of constructions within the group at line (I25p , using the 
specified gate-set. E.g. by conjugating e*^^^ with two C-Not gates one can create an e^^^i^^ 
composite unitary, and similarly g^^^i^^^s ^g^^ |-,g created, etc. 

5.1.1 Reductions between Z-networks and X-programs 

This neat mathematical structure {i.e. as at line (|25p l enables probability distributions of 
the kind at line ([T]) to be simulated. 

Lemma 7 A Z-network can always be designed to simulate an X-program efficiently. 

Proof : Simply by initialising the input qubits to the Z-network to | +" ) in the Hadamard 
basis, and measuring output in the same Hadamard basis, the simulation of a given X- 
program, P, proceeds by constructing a small network of CNot gates and an exp{i9pZ) 
gate for each row p of P. Details omitted for brevity. ■ 




(24) 




(25) 
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Conversely, 

Lemma 8 An X-program can efficiently simulate a given Z-network, provided that that Z- 
network uses Hadamard-basis input, C-Nots, X gates, and e'^^ gates only, and outputs in 
the Hadamard basis. 

Proof : The required reduction just associates one X-program element {9p,p) to each e*^^ 
gate, setting Op *^ 6 and specifying p according to the location of the e*^^ gate and the 
totality of C-Not gates to the left of that gate. A final piece of simple post-processing 
is needed after the measurement phase of the X-program, to account for the C-Nots in 
the Z-network, but this post-processing simply consists in applying the same C-Nots (with 
directions reversed) on the classical measurement outcomes. (This is because moving from 
the Hadamard basis to the computational basis has the effect of reversing the direction of 
C-Not gates.) ■ 

The point of these reductions is to highlight the sense in which the group associated to 
simple Z-networks stands in the same relation to the set of X-programs as the 'full' SU{2^) 
Lie group stands to the set of proper full-blown quantum algorithms. 



5.2 Graph-programs 

Programs for the second of these architectures we call "Graph-programs", since the pro- 
gram is most easily described as the construction of a graph state followed by a series of 
measurements of the qubits in the graph state in various bases [2J. Graph state comput- 
ing architectures are popular candidates for scalable fully universal quantum processors 
[131 [12]. Here we are concerned not with universal architectures, but with the appropriate 
restriction to 'unit time' computation. So, unlike universal graph state computation, our 
Graph-programs do not admit adaptive feed-forward, which is to say that all measurement 
angles must be known and fixed at compile-time, so that all measurements can be made 
simultaneously once the graph state has been built. In this sense, the 'depth' of a Graph- 
program is 1. A graph state has qubits that are initially devoid of information, but which 
are entangled together according to the pattern of some pre-specified graph. A graph state 
can be constructed without inherent temporal complexity, perhaps even prepared in a single 
computational time-step, because there is no implicit reason requiring one edge of the graph 
to be prepared before any other. (It is still fair to argue that the circuit-depth of the process 
that generates a graph state is linear in the valency of the graph, but that is not a measure 
of 'inherent' temporal complexity.) We will show how Graph-programs can simulate the 
output of X-programs if a little trivial classical post-processing of the measurement results 
is allowed. 

A Graph-program is taken to be an undirected (usually bipartite) graph with labelled and 
distinguished vertices. The vertex set is denoted V, of cardinality n, and for each v (z V 
there is an element of SU{2) labelling it; R^j G SU(2). The edge set is denoted E. To 
implement the program, a qubit is associated with each vertex and is initialised to the state 
I -|- ) in the Hadamard basis. Then a Controlled-Z Pauli gate is applied between each pair 
of qubits whose vertices are a pair in E. Since these Controlled-Z gates commute, they may 
be applied simultaneously, at least in theory. Finally, each vertex qubit v is measured in 
the direction prescribed by its label R^, returning a single classical bit. Clearly the order of 
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measurement doesn't matter, because the measurement direction is prescribed rather than 
adaptive. Hence a sample from F2 (a bit-string) is thus generated as the total measurement 
result. 

Combining this together, we see that the probability distribution for such an output is 



F(X = x) 



np "TT 1 + Zu -\- — ZuZy I „ , 
vGV {u,v)gE 



(26) 



Here the measurement has been written using the notation of the computational basis, with 
an appropriate (passive) rotation immediately prior. 



5.2.1 Graph-programs and X-programs 

Having described three architectures, we've indicated that the X-programs, characterised 
by the formulation at line ([1]), are in some natural sense the 'lowest common denominator' 
amongst the architectures (and unitary groups) of interest. 

Lemma 9 A Graph-program can always be designed to simulate an X-program efficiently. 

Proof : Suppose we're given an X-program, written { {6p,p) : p € P C }• Then it 
is straightforward to simulate it on a Graph State architecture, as follows. Let V be the 
disjoint union of [l..n] and P, so that the graph state used to simulate the program will have 
one primal qubit for each qubit being simulated, plus one ancilla qubit for each program 
element p, the total cardinality of V being polynomial in n, by hypothesis. Let (j, p) G 
exactly when the jth component of p is a 1. In this way, the resulting graph is bipartite, 
linking primal qubits to those ancilte that they have to do with. Let Rj be the Hadamard 
element {H) for all primal qubits, so that all primal qubits are measured in the Hadamard 
basis. Let Rp = exp(z0pX), so that every ancilla qubit is measured in the (yZ)-plane at 
an angle specified by the corresponding program element. 

If the resulting Graph-program is executed, it will return a sample vector x S W^'^'^^ for 
which the n bits from the primal qubits are correlated with the #P bits from the ancillae in a 
fashion which captures the desired output, (though these two sets separately — marginally — 
will look like flat random data.) To recover a sample from the desired distribution, we simply 
apply a classical Controlled-Not gate from each ancilla bit to each neighbouring primal bit, 
according to E, and then discard all the ancilla bits. One can use simple circuit identities 
to check that this produces the correct distribution of line ([T|) precisely. ■ 



5.3 Physical comparison 

An X-program uses only 'one timeslice', but presumes an arbitrarily large gatespan (inter- 
action length), up to n, the number of qubits in the X-program. 

The Z-network reduction is physically preferable because it uses small gates (gatespan = 2) 
to simulate an X-program. However, it has network depth that is not constant; rather, it 
is likely to be quadratic in n, in general, unless one is careful to make optimisations when 
'compiling' the network. 
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The Graph-program reduction uses more qubits than n, but has better depth properties 
than the Z-network architecture. Again, only smah gates are used. It seems natural to 
suggest that the 'true temporal cost' of implementing a Graph program would either be 
one process for measurement plus either one process or degree processes for initialising the 
graph state, where degree denotes the vertex-degree of the underlying graph. The number 
of qubits required in the simulation of an X-program is also slightly larger, since ancilla 
qubits are used. The worst case in our interactive protocol application would therefore 
require about 5q/2 qubits, rather than q/2, though the vertex-degree would no bigger than 
the number of qubits. The kinds of graph called for in our Graph-program reduction are not 
the usual cluster state graphs (regular planar lattice arrangement) used in measurement- 
based quantum computation. The bipartite graphs described in the reduction will usually 
be far from planar for generic X-programs, having a relatively high genus. This means that 
the graph cannot be 'laid out' on a plane without edges crossing. 
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Appendix 

A proof of theorem [H Throughout, the variable p ranges over the rows of the binary matrix 
P, which are the program elements of an X-program. 

Derive line ([8]) from line ([T]) in the case that the value 6 is constant. 



F(X = x) 



^ x I exp 



X, 1 10" 



2-"^(_l)xa-^^| ,^p| J2i9^ (g) Z, Y.\^' 

P i:Pj=l / b 

2 



E 



a,d 



(-l)''-^^ exp i^^(-l)P-^^ 
(-l)-'^" exp ( f0^(-l)P-"(l - (-l)P d^ 



(27) 



On the second line we made a change of basis, so as to replace the Pauli X operators with 
Pauli Z ones. 
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F(X • = 0) = 2" Ex [ {x • = 0} • F(X = x) 



T E 



a,d,x 



(1 + (-1)--^ ) 



2" Ea,d 



{d = 0} + {d = s}j j^^(_i)p a^ 



2 1 + 



ieEp(-i)P-"^(i-(-i)P"^ 



These transformations are conceptually simple but notationally untidy. 



2 • F(X • = 0) - 1 = ^ e^^'^ Ea,,/ 



+ Ep(-i)P-"'^(i-{-i)P"^ 



(28) 



= ^e^^-^pj J = 2 J] (-l)P-' 

i \ p : p sl' = l 

= J^e^^'^ F(i = 2(n,-2-ii;t(c) ) [ c - ) 

i 

= ^ cos( 26l(ns -2w))-V{w = wt{c) | c ~ Cg ) . (29) 

w 

Here we have used the standard Fourier decomposition of a periodic function, and used the 
fact that the function is known to be real. The variable substitution at the third line was 
c = Pg • understood in the correct basis. At the fourth line it was w = (2ns —j) /4. 



Us 

F(X-s^ = 0) = ^cos2( 6'(ns-2u') ) •F( w; = u't(c) | c ~ Cg 

10=0 

= Ec^Cs [ cos^{ - 2-wt{c)) 
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